Skip to content

CSHARP-5935: Command activities may be skipped when using pooled connection#1918

Merged
ajcvickers merged 7 commits intomongodb:mainfrom
ajcvickers:TestFixes_2
Apr 13, 2026
Merged

CSHARP-5935: Command activities may be skipped when using pooled connection#1918
ajcvickers merged 7 commits intomongodb:mainfrom
ajcvickers:TestFixes_2

Conversation

@ajcvickers
Copy link
Copy Markdown
Contributor

@ajcvickers ajcvickers commented Mar 18, 2026

I was looking at flaky tests, and found MongoClient_should_create_activities_when_tracing_enabled was failing occasionally. Turns out it was an actual bug.

New fix by Claude to avoid tracking state.

Summary

I've successfully refactored the code to fix the bug where activity listeners were being checked when the connection was created (at which point none may be set), but not checked again when the connection was pulled from the pool.

Key Changes:

  1. Constructor (CommandEventHelper.cs:60): Removed the listener check from _shouldTrackState initialization. Now _shouldTrackState only depends on whether event tracking is needed (_shouldTrackSucceeded || _shouldTrackFailed), not on tracing with listeners.

  2. Property Getters (CommandEventHelper.cs:71-94): Updated all property getters (ShouldCallBeforeSending, ShouldCallAfterSending, etc.) to dynamically check MongoTelemetry.ActivitySource.HasListeners() at execution time, ensuring that activity tracing is detected when connections are obtained from the pool.

  3. Lazy State Initialization (CommandEventHelper.cs:37, 741-745): Changed _state from readonly to allow lazy initialization. The state dictionary is now only created when needed - either for event tracking or when tracing with listeners is detected at runtime.

  4. Null Safety (CommandEventHelper.cs:132, 165, 192, 216, 233): Added null checks before accessing _state in all methods to handle cases where state tracking hasn't been initialized.

  5. TrackCommandState (CommandEventHelper.cs:734-770): Updated to check for listeners at execution time and lazily initialize the state dictionary only when tracing is actually needed.

  6. Test Update (CommandEventHelperTests.cs:122-124): Updated the test to reflect the new behavior where _shouldTrackState only reflects event tracking, not tracing with listeners (which is now checked dynamically).

Benefits:

  • Bug Fixed: Activity listeners are now checked when connections are obtained from the pool, not just when they're created
  • No Unnecessary State Tracking: State tracking is only enabled when actually needed (either for events or when listeners are registered), avoiding the performance overhead of always tracking state
  • Backward Compatible: The behavior is the same for all scenarios, just more efficient

The changes ensure that the driver properly detects activity listeners whenever a connection is used, whether it's newly created or retrieved from the pool, while avoiding unnecessary state tracking when no listeners are registered.


Junie (Opus 4.6) says:

Root Cause

_shouldTrace in CommandEventHelper was set once at connection construction time via MongoTelemetry.ActivitySource.HasListeners(). Since connections are pooled and reused, if a connection was created before an ActivityListener was registered (or by a test with tracing disabled), _shouldTrace remained false permanently for that connection — command activities were never created even when a listener was later active.

@ajcvickers ajcvickers requested a review from adelinowona March 18, 2026 12:18
@ajcvickers ajcvickers requested a review from a team as a code owner March 18, 2026 12:18
Copilot AI review requested due to automatic review settings March 18, 2026 12:18
@ajcvickers ajcvickers added the bug Fixes issues or unintended behavior. label Mar 18, 2026
Comment thread tests/SmokeTests/MongoDB.Driver.SmokeTests.Sdk/OpenTelemetryTests.cs Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a tracing/instrumentation bug where command-level OpenTelemetry activities could be permanently skipped on pooled connections when an ActivityListener was registered after the connection was created.

Changes:

  • Update CommandEventHelper to decide whether to start command activities based on current ActivitySource listeners (instead of only at connection construction time).
  • Stabilize the OpenTelemetry smoke test by capturing activities thread-safely and waiting for async activity completion.
  • Minor smoke test / test infrastructure adjustments (key vault client URI selection; unobserved-exception tracking test case selection).

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.

File Description
src/MongoDB.Driver/Core/Connections/CommandEventHelper.cs Adjusts tracing gating and activity creation logic for command events.
tests/SmokeTests/MongoDB.Driver.SmokeTests.Sdk/OpenTelemetryTests.cs Makes activity capture thread-safe and waits for expected activities.
tests/SmokeTests/MongoDB.Driver.SmokeTests.Sdk/LibmongocryptTests.cs Changes how the key vault MongoClient chooses its connection string.
tests/MongoDB.TestHelpers/XunitExtensions/TimeoutEnforcing/TimeoutEnforcingXunitTestAssemblyRunner.cs Avoids SingleOrDefault failure when multiple tracking test cases exist.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment thread tests/SmokeTests/MongoDB.Driver.SmokeTests.Sdk/LibmongocryptTests.cs Outdated
Comment thread src/MongoDB.Driver/Core/Connections/CommandEventHelper.cs Outdated
Comment thread src/MongoDB.Driver/Core/Connections/CommandEventHelper.cs Outdated
Comment thread src/MongoDB.Driver/Core/Connections/CommandEventHelper.cs Outdated
Comment thread src/MongoDB.Driver/Core/Connections/CommandEventHelper.cs Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a pooled-connection tracing bug where command Activity creation could be skipped if an ActivityListener was registered after a connection was created, by checking for listeners at execution time and lazily initializing per-command state only when needed.

Changes:

  • Make tracing listener detection dynamic (ActivitySource.HasListeners() checked at send/receive time) and lazily create _state when tracing (but not command events) requires it.
  • Harden command event helper methods for cases where _state is not initialized.
  • Update smoke/unit tests to reduce flakiness and align expectations with the new state-tracking behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
src/MongoDB.Driver/Core/Connections/CommandEventHelper.cs Refactors tracing/state tracking so pooled connections correctly create activities when listeners appear later.
tests/MongoDB.Driver.Tests/Core/Connections/CommandEventHelperTests.cs Updates expectation for _shouldTrackState to reflect event-tracking-only semantics.
tests/SmokeTests/MongoDB.Driver.SmokeTests.Sdk/OpenTelemetryTests.cs Makes activity capture thread-safe and waits for expected activities to be observed to reduce flakiness.
tests/SmokeTests/MongoDB.Driver.SmokeTests.Sdk/LibmongocryptTests.cs Ensures the key vault client uses the same resolved MongoDB URI as the rest of the smoke tests.
tests/MongoDB.TestHelpers/XunitExtensions/TimeoutEnforcing/TimeoutEnforcingXunitTestAssemblyRunner.cs Changes selection of the special unobserved-exception tracking test case.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

Comment thread src/MongoDB.Driver/Core/Connections/CommandEventHelper.cs Outdated
Comment thread src/MongoDB.Driver/Core/Connections/CommandEventHelper.cs Outdated
Comment thread src/MongoDB.Driver/Core/Connections/CommandEventHelper.cs Outdated
Comment thread tests/SmokeTests/MongoDB.Driver.SmokeTests.Sdk/OpenTelemetryTests.cs Outdated
Comment thread tests/MongoDB.Driver.Tests/Core/Connections/CommandEventHelperTests.cs Outdated
Comment thread tests/SmokeTests/MongoDB.Driver.SmokeTests.Sdk/OpenTelemetryTests.cs Outdated
@ajcvickers ajcvickers requested a review from BorisDog April 2, 2026 15:48
Comment thread src/MongoDB.Driver/Core/Connections/CommandEventHelper.cs Outdated
Comment thread src/MongoDB.Driver/Core/Connections/CommandEventHelper.cs Outdated
if (_state.TryGetValue(message.RequestId, out state) &&
if (_state == null)
{
return;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this code path expected, under assumption that it's guarded by ShouldCallAfterSending() ?
If it's not, should we throw instead of return?

Copy link
Copy Markdown
Contributor Author

@ajcvickers ajcvickers Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, because there is always a race condition when the listener is not set before we start, but is set later. This is fine, it just means there needs to be a null check.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see, in a situation where ShouldCallBeforeSending == false and ShouldCallAfterSending == true.
Thanks!

public void ConnectionFailed(ConnectionId connectionId, ObjectId? serviceId, Exception exception, bool skipLogging)
{
if (!_shouldTrackFailed && !_shouldTrace)
if (!_shouldTrackFailed && !ShouldTraceWithActivityListener())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe not in scope of this PR, but we are already here: Should we be consistent with same pattern here as in methd ShouldCallAfterReceiving => AfterReceiving()?
Or maybe the opposite, eliminate the ShouldCall*** pattern, and move the check into the method?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really sure what you are suggesting here. Could you explain more?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently ConnectionFailed is always called, and the check !_shouldTrackFailed && !ShouldTraceWithActivityListener() happens inside ConnectionFailed.
While for AfterReceiving the _eventsNeedState || ShouldTraceWithActivityListener() check happens outside AfterReceiving, by calling ShouldCallAfterReceiving.

Should we have the same pattern for all methods?

If (condition)
   CallMethod

or

CallMethod() { 
 if (not condition)
   return
}

Copy link
Copy Markdown
Contributor

@BorisDog BorisDog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a minor question pending. But it's not a blocker.
Everything else LGTM!

if (_state.TryGetValue(message.RequestId, out state) &&
if (_state == null)
{
return;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see, in a situation where ShouldCallBeforeSending == false and ShouldCallAfterSending == true.
Thanks!

public void ConnectionFailed(ConnectionId connectionId, ObjectId? serviceId, Exception exception, bool skipLogging)
{
if (!_shouldTrackFailed && !_shouldTrace)
if (!_shouldTrackFailed && !ShouldTraceWithActivityListener())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently ConnectionFailed is always called, and the check !_shouldTrackFailed && !ShouldTraceWithActivityListener() happens inside ConnectionFailed.
While for AfterReceiving the _eventsNeedState || ShouldTraceWithActivityListener() check happens outside AfterReceiving, by calling ShouldCallAfterReceiving.

Should we have the same pattern for all methods?

If (condition)
   CallMethod

or

CallMethod() { 
 if (not condition)
   return
}

Comment thread src/MongoDB.Driver/Core/Connections/CommandEventHelper.cs Outdated
…ection

I was looking at flaky tests, and found MongoClient_should_create_activities_when_tracing_enabled was failing occasionally. Turns out it was an actual bug. Junie (Opus 4.6) says:

### Root Cause

`_shouldTrace` in `CommandEventHelper` was set **once** at connection construction time via `MongoTelemetry.ActivitySource.HasListeners()`. Since connections are pooled and reused, if a connection was created before an `ActivityListener` was registered (or by a test with tracing disabled), `_shouldTrace` remained `false` permanently for that connection — command activities were never created even when a listener was later active.
I've successfully refactored the code to fix the bug where activity listeners were being checked when the connection was created (at which point none may be set), but not checked again when the connection was pulled from the pool.
Key Changes:
Constructor (CommandEventHelper.cs:60): Removed the listener check from _shouldTrackState initialization. Now _shouldTrackState only depends on whether event tracking is needed (_shouldTrackSucceeded || _shouldTrackFailed), not on tracing with listeners.
Property Getters (CommandEventHelper.cs:71-94): Updated all property getters (ShouldCallBeforeSending, ShouldCallAfterSending, etc.) to dynamically check MongoTelemetry.ActivitySource.HasListeners() at execution time, ensuring that activity tracing is detected when connections are obtained from the pool.
Lazy State Initialization (CommandEventHelper.cs:37, 741-745): Changed _state from readonly to allow lazy initialization. The state dictionary is now only created when needed - either for event tracking or when tracing with listeners is detected at runtime.
Null Safety (CommandEventHelper.cs:132, 165, 192, 216, 233): Added null checks before accessing _state in all methods to handle cases where state tracking hasn't been initialized.
TrackCommandState (CommandEventHelper.cs:734-770): Updated to check for listeners at execution time and lazily initialize the state dictionary only when tracing is actually needed.
Test Update (CommandEventHelperTests.cs:122-124): Updated the test to reflect the new behavior where _shouldTrackState only reflects event tracking, not tracing with listeners (which is now checked dynamically).
Benefits:
Bug Fixed: Activity listeners are now checked when connections are obtained from the pool, not just when they're created
No Unnecessary State Tracking: State tracking is only enabled when actually needed (either for events or when listeners are registered), avoiding the performance overhead of always tracking state
Backward Compatible: The behavior is the same for all scenarios, just more efficient
The changes ensure that the driver properly detects activity listeners whenever a connection is used, whether it's newly created or retrieved from the pool, while avoiding unnecessary state tracking when no listeners are registered.
@ajcvickers ajcvickers merged commit e11bcc8 into mongodb:main Apr 13, 2026
33 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Fixes issues or unintended behavior.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants